This repository contains two datasets:

1) semantic shifts (ground-truth&annotation): A dataset for evaluating methods that detect semantic shifts across viewpoints. The file is in CSV format, with “;” as the separator. Each line contains a concept, the ground truth, and the annotations made by four annotators. The values 0 and 1 correspond to two different viewpoints.

2) summarization(ground-truth) contains the ground-truth summarization for a set of concepts, produced by three different methods. summarization(annotation) contains the annotations made by 10 annotators, who were asked to assign each generated summary (in summarization(ground-truth).text) to either the before-9/11 or the after-9/11 category. Both files are in CSV format, with “;” as the separator.
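
Because the files use “;” rather than a comma as the separator, a default CSV reader will not split the fields correctly. The following is a minimal sketch of one way to load a file with Python's standard library. The helper name read_semicolon_csv is illustrative and not part of this repository, and the UTF-8 encoding is an assumption. The sketch returns raw string fields and makes no assumption about column order or a header row.

```python
import csv
from pathlib import Path


def read_semicolon_csv(path, encoding="utf-8"):
    """Return every non-empty row of a ';'-separated file as a list of strings.

    Hypothetical helper: the encoding default is an assumption, and no
    column names are inferred because the README does not specify them.
    """
    with Path(path).open(newline="", encoding=encoding) as handle:
        reader = csv.reader(handle, delimiter=";")
        # Skip blank lines, which csv.reader yields as empty lists.
        return [row for row in reader if row]
```

To use it, pass the path of one of the dataset files in this repository, for example rows = read_semicolon_csv(path_to_file), and inspect rows[0] to see whether the first line is a header before relying on field positions.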

For more information, please either:

1) Look at the following paper:
Hosein Azarbonyad, Mostafa Dehghani, Kaspar Beelen, Alexandra Arkut, Maarten Marx, and Jaap Kamps, “Words are Malleable: Computing Semantic Shifts in Political and Media Discourse”, in Proceedings of The ACM International Conference on Information and Knowledge Management (CIKM), 2017.

or 

2) Send an email to: h.azarbonyad@uva.nl